2. Consuming CHANGES
You will now implement a modified version of the uspUpdateBooks stored procedure, shown in Example 5, that uses the new INSERT OVER DML syntax to capture and store historical data for updates appended to the Book table.
Example 5. Using MERGE with INSERT OVER DML.
CREATE PROCEDURE uspUpdateBooks AS
BEGIN
  INSERT INTO Book(ISBN, Price, Shelf, ArchivedAt)
  SELECT ISBN, Price, Shelf, GETDATE() FROM
    (MERGE Book AS B
     USING WeeklyChange AS WC
     ON B.ISBN = WC.ISBN AND B.ArchivedAt IS NULL
     WHEN MATCHED AND (B.Price <> WC.Price OR B.Shelf <> WC.Shelf) THEN
       UPDATE SET Price = WC.Price, Shelf = WC.Shelf
     WHEN NOT MATCHED THEN
       INSERT VALUES(WC.ISBN, WC.Price, WC.Shelf, NULL)
     OUTPUT $action, WC.ISBN, Deleted.Price, Deleted.Shelf
    ) CHANGES(RowAction, ISBN, Price, Shelf)
  WHERE RowAction = 'UPDATE';
END
Because the Book table contains historical data that is irrelevant to the merge operation, a condition has been added to the join predicate after the ON keyword that tests for NULL in the ArchivedAt column. This tells MERGE to match only current books against the source of weekly changes and to ignore the archived history rows entirely.
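If you want to run Example 5 yourself, table definitions along these lines will work. They are not shown in the text, so the column types here are illustrative assumptions; the only structural requirements are the four Book columns (with a nullable ArchivedAt) and the three WeeklyChange columns:

```sql
-- Assumed schema (illustrative types); ArchivedAt is NULL for current rows
-- and holds the archive timestamp for history rows.
CREATE TABLE Book(
  ISBN       varchar(20) NOT NULL,
  Price      money       NOT NULL,
  Shelf      int         NOT NULL,
  ArchivedAt datetime2   NULL
);

CREATE TABLE WeeklyChange(
  ISBN  varchar(20) NOT NULL,
  Price money       NOT NULL,
  Shelf int         NOT NULL
);
```

Note that ISBN alone cannot be a primary key on Book, because the same ISBN appears once as the current row and again in each archived version.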
The CHANGES keyword is what makes INSERT OVER DML possible. CHANGES exposes the columns of the OUTPUT clause defined on the inner MERGE statement to the WHERE clause of the outer INSERT INTO…SELECT statement. In Example 5, this includes the virtual $action column, the ISBN number, and the old values for price and shelf being replaced by the update operation.
By exposing the virtual $action column through the CHANGES keyword as RowAction, the INSERT INTO…SELECT statement can apply a WHERE clause that filters out rows produced for newly inserted books and appends only the prior versions of changed books back into the Book table. As explained, this cannot be achieved with the OUTPUT…INTO clause on the MERGE statement itself, but it is possible with this very specific INSERT OVER DML syntax.
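To see the contrast, consider a hypothetical version of the same MERGE that uses OUTPUT…INTO instead (BookArchive is an invented table for illustration). OUTPUT…INTO accepts no WHERE clause, so every captured action, including the unwanted insert actions, lands in the target table:

```sql
-- Hypothetical OUTPUT...INTO variant: every action is captured, with no
-- way to exclude the insert actions (BookArchive is an assumed table).
MERGE Book AS B
USING WeeklyChange AS WC
ON B.ISBN = WC.ISBN AND B.ArchivedAt IS NULL
WHEN MATCHED AND (B.Price <> WC.Price OR B.Shelf <> WC.Shelf) THEN
  UPDATE SET Price = WC.Price, Shelf = WC.Shelf
WHEN NOT MATCHED THEN
  INSERT VALUES(WC.ISBN, WC.Price, WC.Shelf, NULL)
OUTPUT $action, WC.ISBN, Deleted.Price, Deleted.Shelf, GETDATE()
  INTO BookArchive(RowAction, ISBN, Price, Shelf, ArchivedAt);
-- No WHERE clause is permitted on OUTPUT...INTO, so insert actions
-- arrive in BookArchive with NULL Price and Shelf values.
```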
The code filters out insert actions with the simple criterion WHERE RowAction = 'UPDATE'. You could of course apply more sophisticated logic than that. For example, you might save history data only for certain users, regions, or any other changed data columns returned by the OUTPUT clause and exposed by the CHANGES keyword. And that's exactly the key to the power of INSERT OVER DML.
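As a sketch of that idea, the outer WHERE clause can combine RowAction with any other column exposed by CHANGES. This variation of the statement from Example 5 archives history only when the book's old price exceeded $50 (the threshold is purely illustrative):

```sql
INSERT INTO Book(ISBN, Price, Shelf, ArchivedAt)
SELECT ISBN, Price, Shelf, GETDATE() FROM
  (MERGE Book AS B
   USING WeeklyChange AS WC
   ON B.ISBN = WC.ISBN AND B.ArchivedAt IS NULL
   WHEN MATCHED AND (B.Price <> WC.Price OR B.Shelf <> WC.Shelf) THEN
     UPDATE SET Price = WC.Price, Shelf = WC.Shelf
   WHEN NOT MATCHED THEN
     INSERT VALUES(WC.ISBN, WC.Price, WC.Shelf, NULL)
   OUTPUT $action, WC.ISBN, Deleted.Price, Deleted.Shelf
  ) CHANGES(RowAction, ISBN, Price, Shelf)
WHERE RowAction = 'UPDATE'
  AND Price > 50;  -- Price here is the OLD price exposed by CHANGES
```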
Walk through the scenario step by step. Start with two books (A and B) and one price change (book A from $100 to $110), as shown here:
INSERT INTO Book VALUES('A', 100, 1, NULL)
INSERT INTO Book VALUES('B', 200, 2, NULL)
INSERT INTO WeeklyChange VALUES('A', 110, 1)
SELECT * FROM Book
SELECT * FROM WeeklyChange
GO
ISBN Price Shelf ArchivedAt
---- ----- ----- -----------------------
A 100 1 NULL
B 200 2 NULL
(2 row(s) affected)
ISBN Price Shelf
---- ----- -----
A 110 1
(1 row(s) affected)
When you execute uspUpdateBooks, the inner MERGE statement will update the price for book A in the Book table and send its original values to the OUTPUT clause. The OUTPUT clause values are then consumed by the outer INSERT INTO…SELECT statement via the CHANGES keyword. The outer statement can therefore use those original book values for inserting historical data back into the Book table with an ArchivedAt value set to the current server date and time (just before 5:00 PM on 2/25/2012, in this example), as follows:
EXEC uspUpdateBooks
SELECT * FROM Book
GO
ISBN Price Shelf ArchivedAt
---- ----- ----- ---------------------------
A 110 1 NULL
A 100 1 2012-02-25 16:57:19.8600000
B 200 2 NULL
(3 row(s) affected)
You can see the current data for book A (the row with an ArchivedAt date of NULL) at $110, and you also see the previous data for book A, which was changed from $100 at about 5:00 PM on 2/25/2012. Sometime later, you receive a new set of changes. This time, book A has moved from shelf 1 to shelf 6, and a new book C has been added, as shown here:
DELETE FROM WeeklyChange
INSERT INTO WeeklyChange VALUES('A', 110, 6)
INSERT INTO WeeklyChange VALUES('C', 300, 3)
GO
Just like the first time you ran the stored procedure, the current row for book A is updated, and a snapshot of the previous contents of the row is added to the table with the current server date and time, as shown here:
EXEC uspUpdateBooks
SELECT * FROM Book
GO
ISBN Price Shelf ArchivedAt
---- ----- ----- ---------------------------
A 110 6 NULL
A 100 1 2012-02-25 16:57:19.8600000
A 110 1 2012-02-25 16:58:36.1900000
B 200 2 NULL
C 300 3 NULL
(5 row(s) affected)
Now book A has two history records showing the values saved from two earlier updates identified by date and time values in the ArchivedAt column. There is also a new row for book C, which was inserted by the WHEN NOT MATCHED clause of the MERGE statement.
However, notice that there is no history row for book C: the insert action captured by the MERGE statement's OUTPUT clause was subsequently filtered out in the WHERE clause of the outer INSERT INTO…SELECT statement. Had you not filtered out insert actions there, another history record would have been added for book C with meaningless NULL values for both Price and Shelf. By filtering OUTPUT actions with INSERT OVER DML, you therefore avoid the proliferation of history rows that would otherwise be generated with each new book. You also eliminate the extra columns for new values that the previous example's BookHistory table required, because each updated version is archived in the Book table itself, and you eliminate the need to store an Action column, because only update actions are captured. In the end, a lot of needless storage is avoided, and it is all accomplished with a single INSERT OVER DML statement.
As you can see, the combination of the MERGE and INSERT OVER DML features in SQL Server is a very powerful addition to T-SQL. A fully loaded MERGE (handling inserts, updates, and deletes) wrapped in INSERT OVER DML delivers an enormous amount of functionality in a single, manageable statement. We recommend using these features in lieu of traditional multi-statement approaches wherever possible in your future development. In addition to improved performance, you'll appreciate the greater manageability that comes from consolidating multiple statements into one.